Fully computer-generated artwork has been an area of increasing interest in the NFT space. Several collections feature artwork generated exclusively by GANs, such as GAN Apes [13] and GAN Nature [12]. These collections have not met with widespread success, owing to the heterogeneity of the resulting artwork and the lack of any discernible properties or features; these shortcomings limit their usefulness as “identifiers”, as discussed above. In this project, we aimed to create a system that automates and facilitates the creation of generative art pieces to supplement the limited set of artwork released within a given collection. The final artwork should match the art style and aesthetic of, and be largely indistinguishable from, the rest of the collection, and should reflect the features corresponding to a set of user-defined input “properties”.
We chose the Bored Ape Yacht Club dataset (https://www.kaggle.com/stanleyjzheng/bored-apes-yacht-club), in part for its high usability rating and for the uniform dimensions of its files. The 1.46 GB dataset contains 10,000 631x631 images with a homogeneous art style. For preprocessing, we resized all images to 256x256 to comply with StyleGAN2's architectural requirements, and reduced the number of channels from 4 to 3 by converting the images from RGBA to RGB.
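For illustration, a minimal preprocessing sketch in Python using Pillow follows; the folder names and dataset layout are assumptions for the example, not the project's actual code.

```python
from pathlib import Path
from PIL import Image

SRC = Path("bayc_raw")   # hypothetical folder of the original 631x631 RGBA PNGs
DST = Path("bayc_256")   # output folder of 256x256 RGB images
DST.mkdir(exist_ok=True)

for png in SRC.glob("*.png"):
    img = Image.open(png).convert("RGB")          # drop the alpha channel (RGBA -> RGB)
    img = img.resize((256, 256), Image.LANCZOS)   # StyleGAN2 expects square power-of-two sizes
    img.save(DST / png.name)
```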
The most recent contribution to the literature tackling the generation of NFT artwork through deep learning is NFTGan, which is based on the style-based generative model StyleGAN2.
NFTGan was trained on a dataset of 2283 images resized to 512x512 for compatibility with StyleGAN2; a PyTorch implementation of StyleGAN2 was then trained on this dataset for 59 hours on an NVIDIA Tesla P100 GPU with 16 GB of memory.
StyleGAN was conceived out of the need for more finely grained control in image generation with generative adversarial networks. To that end, it introduced a redesigned generator architecture to improve image quality, while proposing no significant changes to the discriminator or the loss function. Our main focus in this project is StyleGAN's replacement of the traditional input layer: instead of feeding the latent vector z directly into the generator, a multilayer perceptron mapping network transforms it into an intermediate latent space W.
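As a rough illustration of this mapping network, consider the PyTorch sketch below; the layer count, dimensions, and activation are illustrative assumptions rather than the exact StyleGAN configuration.

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """Maps a latent code z to an intermediate latent w (dimensions are illustrative)."""
    def __init__(self, z_dim=512, w_dim=512, num_layers=8):
        super().__init__()
        layers = []
        in_dim = z_dim
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, w_dim), nn.LeakyReLU(0.2)]
            in_dim = w_dim
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        # Normalizing z before the MLP roughly mirrors StyleGAN's input normalization.
        z = z / z.norm(dim=1, keepdim=True).clamp_min(1e-8)
        return self.net(z)

w = MappingNetwork()(torch.randn(4, 512))   # -> shape (4, 512)
```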
Our first experiment emulated the LoGAN model within StyleGAN2: the unprocessed labels are concatenated with the latent noise vector, and the concatenated vector is then passed through the multi-layer mapping network for disentanglement. (All four conditioning variants are sketched in code after the fourth experiment below.)
Our second proposed model uses the conditioning mechanism recently added to StyleGAN: the label vector is embedded using a single dense layer, the processed labels are concatenated with the latent noise vector, and the result is passed through the multi-layer mapping network for disentanglement.
Our third experiment passes the label vector through a deep multi-layer embedding network; the processed labels are then concatenated with the latent noise vector and passed through the multi-layer mapping network for disentanglement.
Our final experiment skips the embedding altogether: the unprocessed labels are concatenated directly with the latent vector after it has passed through the disentanglement mapping network.
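The following PyTorch sketch contrasts the four conditioning variants in one place. The dimensions, layer counts, and the trait count of 7 are illustrative assumptions, and mapping_net is a shallow stand-in for StyleGAN2's 8-layer mapping network, not our actual training code.

```python
import torch
import torch.nn as nn

Z_DIM, LABEL_DIM, EMBED_DIM, W_DIM = 512, 7, 512, 512  # illustrative sizes

def mapping_net(in_dim, out_dim=W_DIM, depth=4):
    """Shallow stand-in for StyleGAN2's disentanglement mapping network."""
    layers = []
    for i in range(depth):
        layers += [nn.Linear(in_dim if i == 0 else out_dim, out_dim),
                   nn.LeakyReLU(0.2)]
    return nn.Sequential(*layers)

z = torch.randn(4, Z_DIM)
labels = torch.randint(0, 2, (4, LABEL_DIM)).float()   # raw multi-hot trait labels

# Experiment 1 (LoGAN-style): concatenate raw labels with z, then map.
w1 = mapping_net(Z_DIM + LABEL_DIM)(torch.cat([z, labels], dim=1))

# Experiment 2 (StyleGAN-style conditioning): single dense embedding, then concat and map.
embed_shallow = nn.Linear(LABEL_DIM, EMBED_DIM)
w2 = mapping_net(Z_DIM + EMBED_DIM)(torch.cat([z, embed_shallow(labels)], dim=1))

# Experiment 3: deep multi-layer label embedding, then concat and map.
embed_deep = nn.Sequential(
    nn.Linear(LABEL_DIM, EMBED_DIM), nn.LeakyReLU(0.2),
    nn.Linear(EMBED_DIM, EMBED_DIM), nn.LeakyReLU(0.2),
    nn.Linear(EMBED_DIM, EMBED_DIM), nn.LeakyReLU(0.2),
)
w3 = mapping_net(Z_DIM + EMBED_DIM)(torch.cat([z, embed_deep(labels)], dim=1))

# Experiment 4: map z alone, then append raw labels to the disentangled w.
w4 = torch.cat([mapping_net(Z_DIM)(z), labels], dim=1)  # width W_DIM + LABEL_DIM
```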
As all candidate datasets were unlabelled, we designed and built a scraper to collect the labels for the chosen dataset.
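A minimal sketch of such a scraper is shown below, assuming per-token metadata JSON in the common NFT format (a list of {"trait_type": ..., "value": ...} entries under "attributes"); the BASE_URL endpoint is a hypothetical placeholder, not the actual source we scraped.

```python
import json
import requests

BASE_URL = "https://example.com/bayc/metadata/{}"   # hypothetical per-token metadata URL

def scrape_labels(num_tokens=10000, out_path="labels.json"):
    labels = {}
    for token_id in range(num_tokens):
        resp = requests.get(BASE_URL.format(token_id), timeout=10)
        resp.raise_for_status()
        meta = resp.json()
        # Collect traits as {trait_type: value} pairs for each token.
        labels[token_id] = {a["trait_type"]: a["value"]
                            for a in meta.get("attributes", [])}
    with open(out_path, "w") as f:
        json.dump(labels, f)

# scrape_labels()  # uncomment to run against a real endpoint
```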
Our conclusions are twofold. First, we believe our results serve as a strong counterargument against relying on quantitative metrics such as FID for the evaluation of generative adversarial networks: our FID scores for both experiments indicated no improvement in generator performance from the start of training through hour 96, yet the generated images closely approximated the anticipated results. Second, we believe our results demonstrate that the LoGAN approach to conditional generation is superior to the conditioning approach in the latest StyleGAN release.